ON THE CAPACITY ACHIEVING TRANSMIT COVARIANCE MATRICES OF MIMO 
CORRELATED RICIAN CHANNELS: A LARGE SYSTEM APPROACH 



Julien Dumont^{1,2}, Philippe Loubaton^2 and Samson Lasaulce^3






^1 France Telecom R&D, 38-40 Rue du General Leclerc 92794 Issy-les-Moulineaux Cedex 9, France
^2 IGM Lab. Info., UMR-CNRS 8049, Universite de Marne la Vallee, 77454 Marne-la-Vallee, France
^3 CNRS-LSS, 5, Rue Joliot Curie, 91192 Gif-sur-Yvette, France
E-mail: {dumont, loubaton}@univ-mlv.fr, lasaulce@lss.supelec.fr



ABSTRACT 

We determine the capacity-achieving input covariance matrices for
coherent block-fading correlated MIMO Rician channels. In contrast
with the Rayleigh and uncorrelated Rician cases, no closed-form
expressions for the eigenvectors of the optimum input covariance
matrix are available. Both the eigenvectors and eigenvalues have to
be evaluated by using numerical techniques. As the corresponding
optimization algorithms are computationally demanding, we evaluate
the limit of the average mutual information when the numbers of
transmit and receive antennas converge to +∞ at the same rate. We
propose an attractive optimization algorithm for the large system
approximant, and establish some convergence results. Numerical
simulation results show that, even for a quite moderate number of
transmit and receive antennas, the new approach provides the same
results as direct maximization approaches of the average mutual
information, while being computationally much more attractive.

1. INTRODUCTION 

Since the seminal work of Telatar [?], it is widely recognized that
the use of multiple antennas at both the transmitter and the receiver
has the potential to increase the capacity of digital communication
systems. However, to benefit from the potential of MIMO systems, it
is necessary to adapt the transmitter to the channel in some optimal
way. In the context of the so-called block-fading channel, the
channel matrix is generally modelled as a random complex Gaussian
matrix, and one of the most popular figures of merit is the ergodic
capacity, defined as the maximum over the input covariance matrices
of the average mutual information. It is in general reasonable to
assume that the mean and the covariance of the channel are available
at the transmitter side. Therefore, the average mutual information
can, in principle, be evaluated and optimized w.r.t. the input
covariance matrix at the transmitter side.



This optimization problem has been addressed extensively in the case
of certain Rayleigh channels. In the context of the so-called
Kronecker model, it has been shown by various authors (see e.g. [?]
for a review) that the eigenvectors of the optimal input covariance
matrix coincide with the eigenvectors of the transmit correlation
matrix. It is therefore sufficient to evaluate the eigenvalues of the
optimal matrix, a problem which can be solved by using standard
optimization algorithms. Note that [17] extended this result to more
general (non-Kronecker) Rayleigh channels. Rician channels have been
comparatively less studied from this point of view. We mention the
work [10] devoted to the case of uncorrelated Rician channels. [10]
proved that the eigenvectors of the optimal input covariance matrix
are the right-singular vectors of the line-of-sight component of the
channel. As in the Rayleigh case, its eigenvalues can be evaluated by
standard routines. The case of correlated Rician channels is
undoubtedly more complicated because the eigenvectors of the optimum
matrix have no closed-form expressions. Therefore, both its
eigenvalues and its eigenvectors have to be evaluated numerically:
see in particular [19], where a barrier interior-point method has
been implemented. The corresponding algorithms are however not very
attractive because the exact expression of the average mutual
information is quite complicated ([11]). Therefore, its gradient and
its Hessian have to be evaluated using computationally intensive
Monte-Carlo simulation methods.

In this paper, we address the optimization of the input covariance
of bi-correlated Rician channels. As the exact expression of the
average mutual information is quite complicated, we propose to
evaluate its limit when the numbers of transmit and receive antennas
converge to +∞ at the same rate, and to address the optimization of
its asymptotic approximation, hopefully a simpler problem. The
asymptotic expression of the mutual information has been obtained by
various authors in the case of MIMO Rayleigh channels, and has been
shown to be quite reliable even for a quite moderate number of
antennas; see e.g. [3], [18] in which large random matrix results
have been used, and [14] which uses the non-rigorous, but useful,
replica method. To our knowledge, the asymptotic analysis of Rician
channels has been considered in [4] (using a result of Girko [?]
valid in the context of restrictive assumptions) and [15] (using the
replica method) in the uncorrelated case, and in [5] in the case of
receive correlated Rician channels. In this paper, we use the recent
results of [9] in which a closed-form asymptotic approximation of the
mutual information is provided, and state without proof new results
concerning its accuracy. Then, we address the optimization of the
large system approximation w.r.t. the input covariance matrix. Like
the average mutual information, the corresponding function is
strictly concave. We propose a simple iterative maximization
algorithm which, in some sense, can be seen as a generalization to
the Rician case of the proposal of [20] devoted to the Rayleigh
context: each iteration requires solving a system of 2 nonlinear
equations as well as a standard waterfilling problem. In contrast
with [20], we give some convergence results: we prove that, if
convergent, the algorithm converges toward the optimum input
covariance matrix. Finally, simulation results confirm the relevance
of our approach.

This paper is organized as follows. Section 2 is devoted to the
presentation of the model and of the underlying assumptions.
Section 3 presents our asymptotic approximation of the average mutual
information. Section 4 is devoted to the maximization of our mutual
information approximation. Finally, simulation results are provided
in Section 5.

2. PRESENTATION OF THE CHANNEL MODEL 

We consider a block fading MIMO static channel and denote by n and N
the number of transmit and receive antennas respectively. The N × n
channel matrix, denoted S, is supposed to be given by S = A + Y. Y is
a zero mean N × n complex Gaussian random matrix (sometimes called
complex circular Gaussian random matrix) given by

$$Y = \frac{1}{\sqrt{n}}\, R^{1/2} X T^{1/2}$$

where R and T are the receive and transmit correlation matrices, and
where X is a zero mean independent identically distributed complex
Gaussian matrix in the sense that the real and imaginary parts of the
entries of X are independent, and have the same variance. A
represents a deterministic N × n matrix. Very often, A is assumed to
be a rank one matrix (see e.g. [5], [12]). However, in important
contexts, this hypothesis is not valid. Macro diversity downlink
transmissions are typical examples in which A is likely to be full
rank. In this context, transmit antennas are very far from each
other, while the distances between the receive antennas are of the
order of the wavelength of the transmitted signals. In such a
context, the line of sight components between each transmit antenna
and the receive antenna array are different, so that A is likely to
be full rank. If the receive antenna array is linear and uniform, a
typical example for A is



$$A = \sqrt{\frac{K}{K+1}}\; [a(\theta_1), \ldots, a(\theta_n)]\; \Lambda \tag{1}$$

where $a(\theta)$ denotes the response of the receive array for the
direction parameter $\theta$, and $\Lambda$ is a diagonal matrix, the
entries of which represent the complex amplitudes of the n line of
sight components. $0 < K < +\infty$ is the so-called Rice factor of
the channel. In the following,
we therefore do not formulate any assumption on the rank of A.
Finally, matrices A, R, T are normalized through the traces of R, T
and $AA^H$, the normalized trace of $AA^H$ being equal to
$\frac{K}{K+1}$, where $0 < K < +\infty$ is the Rice factor of the
channel.
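The channel model of this section can be sketched numerically as follows. This is only an illustrative sketch: the helper names `psd_sqrt` and `sample_channel` are hypothetical, the entries of X are assumed to have unit variance (1/2 per real and imaginary part), and the mean matrix A and the correlation matrices used in the example are arbitrary placeholders rather than the normalized quantities of the paper.

```python
import numpy as np

def psd_sqrt(M):
    """Hermitian square root of a positive semidefinite matrix."""
    w, U = np.linalg.eigh(M)
    return (U * np.sqrt(np.clip(w, 0.0, None))) @ U.conj().T

def sample_channel(A, R, T, rng):
    """Draw one realization of S = A + n^{-1/2} R^{1/2} X T^{1/2}."""
    N, n = A.shape
    # Assumption: i.i.d. entries of X with unit variance
    # (variance 1/2 for the real part and for the imaginary part).
    X = (rng.standard_normal((N, n))
         + 1j * rng.standard_normal((N, n))) / np.sqrt(2)
    return A + psd_sqrt(R) @ X @ psd_sqrt(T) / np.sqrt(n)

# Hypothetical example: 4 x 4 channel with an arbitrary rank-one mean.
rng = np.random.default_rng(0)
N, n = 4, 4
A = np.ones((N, n)) / np.sqrt(N)
R = np.eye(N)
T = np.eye(n)
S = sample_channel(A, R, T, rng)
```

Working with `psd_sqrt` rather than a Cholesky factor keeps the Hermitian square roots $R^{1/2}$ and $T^{1/2}$ used in the text.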



3. ASYMPTOTIC BEHAVIOUR OF THE AVERAGE 
MUTUAL INFORMATION. 

In the following, we denote by $\mathcal{C}$ the cone of nonnegative
Hermitian n × n matrices, and by $\mathcal{C}_1$ the subset of all
matrices Q of $\mathcal{C}$ for which $\frac{1}{n}\mathrm{Tr}(Q) = 1$.
Let Q be an element of $\mathcal{C}_1$. Let $\sigma^2$ be a fixed
noise level. Then, we denote by $I(Q)$ the average mutual information
at the noise level $\sigma^2$, given by

$$I(Q) = \mathbb{E}\left[\log\det\left(\mathbf{I} + \frac{1}{\sigma^2}\, S Q S^H\right)\right] \tag{2}$$

As is well known, the ergodic capacity of the channel is defined as

$$C_E = \max_{Q \in \mathcal{C}_1} I(Q) \tag{3}$$

The optimal input covariance matrix thus coincides with the argument
of the above maximization problem. Note that the function
$Q \mapsto I(Q)$ is strictly concave while the set $\mathcal{C}_1$ on
which it is defined is convex. Therefore ([13]), the maximum of $I$
on $\mathcal{C}_1$ is reached at a unique point.

If R = I and T = I, it is shown in [10] that the eigenvectors of the
optimal input covariance matrix coincide with the right-singular
vectors of A. Apart from this simple case, it seems difficult to
characterize in closed form the eigenvectors of the optimal matrix.
Therefore, its evaluation requires the use of numerical techniques
(see [19]). This approach is complicated by the fact that the
expression of function $I(Q)$ is quite complicated ([11]). Therefore,
its gradient and Hessian have to be evaluated using Monte Carlo
simulations. In the asymptotic regime N → +∞, n → +∞ in such a way
that $N/n \to \alpha$ where $0 < \alpha < +\infty$, $I(Q)$ turns out
to be equivalent to a much simpler term. The purpose of this section
is to review the corresponding asymptotic results. In order to
simplify the notations, the symbol n → +∞ should be understood from
now on as n and N converge to +∞ in such a way that $N/n \to \alpha$.
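To make the cost of the direct approach concrete, the following sketch estimates the average mutual information of Eq. (2) by plain Monte Carlo averaging. It is only an illustration under stated assumptions: the function name `mutual_information_mc` is hypothetical, the entries of X are assumed to have unit variance, square roots of R and T are passed in precomputed, and the result is expressed in nats.

```python
import numpy as np

def mutual_information_mc(A, R_half, T_half, Q, sigma2, trials, rng):
    """Monte Carlo estimate of E[log det(I + S Q S^H / sigma2)] in nats.

    R_half and T_half are square roots of the receive and transmit
    correlation matrices; X is assumed to have unit-variance entries.
    """
    N, n = A.shape
    total = 0.0
    for _ in range(trials):
        X = (rng.standard_normal((N, n))
             + 1j * rng.standard_normal((N, n))) / np.sqrt(2)
        S = A + R_half @ X @ T_half / np.sqrt(n)
        M = np.eye(N) + (S @ Q @ S.conj().T) / sigma2
        # slogdet returns (sign, log|det|); M is Hermitian positive definite.
        total += np.linalg.slogdet(M)[1]
    return total / trials

# Hypothetical example: uncorrelated 4 x 4 channel, Q = I, sigma2 = 1.
rng = np.random.default_rng(0)
n = N = 4
A = np.ones((N, n)) / np.sqrt(N)
estimate = mutual_information_mc(A, np.eye(N), np.eye(n), np.eye(n),
                                 sigma2=1.0, trials=2000, rng=rng)
```

Every evaluation of the objective (and, in a Newton-type method, of its gradient and Hessian) requires such an average, which is what makes direct maximization expensive.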



$I(Q)$ coincides with the average mutual information of the virtual
channel

$$S Q^{1/2} = A Q^{1/2} + \frac{1}{\sqrt{n}}\, R^{1/2} X U \left(Q^{1/2} T Q^{1/2}\right)^{1/2}$$

where matrix U is the constant n × n unitary matrix
$U = T^{1/2} Q^{1/2} (Q^{1/2} T Q^{1/2})^{-1/2}$. As $XU$ has the
same statistical properties as $X$, it appears that $SQ^{1/2}$ can be
interpreted as a bi-correlated Gaussian Rician channel with mean
$AQ^{1/2}$ and receive and transmit correlation matrices $R$ and
$Q^{1/2} T Q^{1/2}$ respectively. In the following, we denote by
$T(Q)$ the matrix $T(Q) = Q^{1/2} T Q^{1/2}$. In order to derive an
asymptotic approximation of $I(Q)$, it is therefore possible to use
the results of [9]. We note that the results of [9] are obtained if
matrices $R$ and $Q^{1/2} T Q^{1/2}$ are diagonal. The unitary
invariance of the mutual information of Gaussian random matrices
however allows these results to be used. We first state the following
result, which derives partly from [9].

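Theorem 1 Assume that $\sup_n \|A\| < +\infty$,
$\sup_n \|R\| < +\infty$, $\sup_n \|T\| < +\infty$, and
$\sup_n \|Q\| < +\infty$, where $\|\cdot\|$ stands for the spectral
norm. Consider the system of equations

$$\kappa = f(\kappa, \tilde{\kappa}, Q), \qquad \tilde{\kappa} = \tilde{f}(\kappa, \tilde{\kappa}, Q) \tag{4}$$

where $f(\kappa, \tilde{\kappa}, Q)$ is given by

$$f(\kappa, \tilde{\kappa}, Q) = \frac{1}{n} \mathrm{Tr}\left\{ R \left[ \sigma^2 (\mathbf{I} + R\tilde{\kappa}) + A Q^{1/2} (\mathbf{I} + T(Q)\kappa)^{-1} Q^{1/2} A^H \right]^{-1} \right\} \tag{5}$$

and $\tilde{f}(\kappa, \tilde{\kappa}, Q)$ by

$$\tilde{f}(\kappa, \tilde{\kappa}, Q) = \frac{1}{n} \mathrm{Tr}\left\{ T(Q) \left[ \sigma^2 (\mathbf{I} + T(Q)\kappa) + Q^{1/2} A^H (\mathbf{I} + R\tilde{\kappa})^{-1} A Q^{1/2} \right]^{-1} \right\} \tag{6}$$

Then, equations (4) have unique strictly positive solutions
$(\delta(Q), \tilde{\delta}(Q))$. Moreover, when n → +∞,

$$I(Q) = \bar{I}(Q) + O(1/n) \tag{7}$$

where the asymptotic approximant $\bar{I}(Q)$ is defined by

$$\bar{I}(Q) = \log\det\left[\mathbf{I} + H(Q)\, Q\, H(Q)\right] + \log\det\left[\mathbf{I} + \tilde{\delta}(Q) R\right] - n \sigma^2\, \delta(Q)\, \tilde{\delta}(Q) \tag{8}$$

where $H(Q)$ represents the n × n positive definite^1 matrix defined
by

$$H(Q) = \left[ \delta(Q)\, T + \frac{1}{\sigma^2} A^H (\mathbf{I} + \tilde{\delta}(Q) R)^{-1} A \right]^{1/2} \tag{9}$$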



The proof of this result is far from being obvious, and is omitted.
It is partly based on the results of [9], from which one can deduce
that $I(Q) = \bar{I}(Q) + o(n)$. The fact that
$I(Q) - \bar{I}(Q) = O(1/n)$ is not obvious at all, and follows
specifically from the fact that matrix S has a Gaussian complex
distribution. In particular, in the Gaussian real case,
$I(Q) - \bar{I}(Q) = O(1)$. This is in accordance with [?], in which
a similar result is proved in the simpler context A = 0 and Q = I,
and with the predictions of the replica method in [14] in the case
A = 0 and in [15] in the case R = I, T = I and Q = I. This very fast
convergence rate tends to explain why the asymptotic evaluations of
the mean mutual information are reliable even for a quite moderate
number of antennas, as remarked e.g. in [?]. See Section 5 for
simulation evidence.
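As an illustration of Theorem 1, the sketch below solves the system (4) by a plain fixed-point iteration and then evaluates the approximant of Eq. (8). The function names `solve_system` and `approximant`, the starting point and the stopping rule are assumptions made for this example; the text only states that the solution is unique, so convergence of this particular iteration is not guaranteed in general.

```python
import numpy as np

def psd_sqrt(M):
    """Hermitian square root of a positive semidefinite matrix."""
    w, U = np.linalg.eigh(M)
    return (U * np.sqrt(np.clip(w, 0.0, None))) @ U.conj().T

def solve_system(A, R, T, Q, sigma2, tol=1e-12, max_iter=5000):
    """Fixed-point iteration on Eq. (4); returns (delta, delta_tilde)."""
    N, n = A.shape
    Qh = psd_sqrt(Q)
    TQ, B = Qh @ T @ Qh, A @ Qh          # T(Q) and A Q^{1/2}
    k = kt = 1.0                         # arbitrary positive starting point
    for _ in range(max_iter):
        M1 = (sigma2 * (np.eye(N) + kt * R)
              + B @ np.linalg.solve(np.eye(n) + k * TQ, B.conj().T))
        M2 = (sigma2 * (np.eye(n) + k * TQ)
              + B.conj().T @ np.linalg.solve(np.eye(N) + kt * R, B))
        k_new = np.trace(np.linalg.solve(M1, R)).real / n     # Eq. (5)
        kt_new = np.trace(np.linalg.solve(M2, TQ)).real / n   # Eq. (6)
        step = abs(k_new - k) + abs(kt_new - kt)
        k, kt = k_new, kt_new
        if step < tol:
            break
    return k, kt

def approximant(A, R, T, Q, sigma2):
    """Large system approximant of Eq. (8), in nats."""
    N, n = A.shape
    d, dt = solve_system(A, R, T, Q, sigma2)
    # H(Q)^2 from Eq. (9); det(I + H Q H) = det(I + Q^{1/2} H^2 Q^{1/2}).
    H2 = d * T + A.conj().T @ np.linalg.solve(np.eye(N) + dt * R, A) / sigma2
    Qh = psd_sqrt(Q)
    term1 = np.linalg.slogdet(np.eye(n) + Qh @ H2 @ Qh)[1]
    term2 = np.linalg.slogdet(np.eye(N) + dt * R)[1]
    return term1 + term2 - n * sigma2 * d * dt

# Hypothetical sanity case: A = 0, R = T = Q = I, N = n = 4, sigma2 = 1.
n = N = 4
delta, delta_tilde = solve_system(np.zeros((N, n)), np.eye(N), np.eye(n),
                                  np.eye(n), 1.0)
value = approximant(np.zeros((N, n)), np.eye(N), np.eye(n), np.eye(n), 1.0)
```

In this sanity case the system reduces to the scalar equation $\delta = 1/(1+\delta)$, so the iteration can be checked by hand.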

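We end this section by a very useful remark. Consider the function
$V(\kappa, \tilde{\kappa}, Q)$ defined by replacing in (8) the
solutions $(\delta(Q), \tilde{\delta}(Q))$ of (4) by fixed parameters
$(\kappa, \tilde{\kappa})$:

$$V(\kappa, \tilde{\kappa}, Q) = \log\det\left[\mathbf{I} + G(\kappa, \tilde{\kappa})\, Q\, G(\kappa, \tilde{\kappa})\right] + \log\det\left[\mathbf{I} + \tilde{\kappa} R\right] - n \sigma^2 \kappa \tilde{\kappa} \tag{10}$$

where $G(\kappa, \tilde{\kappa})$ represents the n × n positive
definite matrix defined by

$$G(\kappa, \tilde{\kappa}) = \left[ \kappa T + \frac{1}{\sigma^2} A^H (\mathbf{I} + \tilde{\kappa} R)^{-1} A \right]^{1/2} \tag{11}$$

We of course note that $H(Q) = G(\delta(Q), \tilde{\delta}(Q))$ and
$\bar{I}(Q) = V(\delta(Q), \tilde{\delta}(Q), Q)$. It is
straightforward to check that

$$\frac{\partial V}{\partial \kappa} = -n\sigma^2 \left( \tilde{\kappa} - \tilde{f}(\kappa, \tilde{\kappa}, Q) \right), \qquad \frac{\partial V}{\partial \tilde{\kappa}} = -n\sigma^2 \left( \kappa - f(\kappa, \tilde{\kappa}, Q) \right) \tag{12}$$

As $(\delta(Q), \tilde{\delta}(Q))$ satisfy Eq. (4), we get
immediately that

$$\left( \frac{\partial V}{\partial \kappa} \right)_{(\delta(Q), \tilde{\delta}(Q), Q)} = \left( \frac{\partial V}{\partial \tilde{\kappa}} \right)_{(\delta(Q), \tilde{\delta}(Q), Q)} = 0 \tag{13}$$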



^1 $H(Q)$ is positive definite because $\delta(Q) > 0$.



This simple observation is the key point of our input covari- 
ance optimization algorithm. 
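The partial derivatives of V can be checked numerically. The sketch below compares central finite differences of V with the closed forms $-n\sigma^2(\tilde{\kappa} - \tilde{f})$ and $-n\sigma^2(\kappa - f)$ on randomly generated matrices. All names (`f_and_ftilde`, `V`, `random_psd`) and the test matrices are hypothetical, and Q is not normalized here since the identity does not depend on the trace constraint.

```python
import numpy as np

def psd_sqrt(M):
    """Hermitian square root of a positive semidefinite matrix."""
    w, U = np.linalg.eigh(M)
    return (U * np.sqrt(np.clip(w, 0.0, None))) @ U.conj().T

def f_and_ftilde(k, kt, A, R, T, Q, sigma2):
    """Right-hand sides f and f-tilde of the fixed-point system."""
    N, n = A.shape
    Qh = psd_sqrt(Q)
    TQ, B = Qh @ T @ Qh, A @ Qh
    M1 = (sigma2 * (np.eye(N) + kt * R)
          + B @ np.linalg.solve(np.eye(n) + k * TQ, B.conj().T))
    M2 = (sigma2 * (np.eye(n) + k * TQ)
          + B.conj().T @ np.linalg.solve(np.eye(N) + kt * R, B))
    f = np.trace(np.linalg.solve(M1, R)).real / n
    ft = np.trace(np.linalg.solve(M2, TQ)).real / n
    return f, ft

def V(k, kt, A, R, T, Q, sigma2):
    """V(kappa, kappa_tilde, Q) for fixed parameters."""
    N, n = A.shape
    Qh = psd_sqrt(Q)
    G2 = k * T + A.conj().T @ np.linalg.solve(np.eye(N) + kt * R, A) / sigma2
    return (np.linalg.slogdet(np.eye(n) + Qh @ G2 @ Qh)[1]
            + np.linalg.slogdet(np.eye(N) + kt * R)[1]
            - n * sigma2 * k * kt)

def random_psd(m, rng):
    """Random Hermitian positive definite m x m matrix (illustration only)."""
    Z = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
    return Z @ Z.conj().T / m + np.eye(m)

rng = np.random.default_rng(1)
N, n, sigma2 = 3, 4, 0.5
A = rng.standard_normal((N, n)) + 1j * rng.standard_normal((N, n))
R, T, Q = random_psd(N, rng), random_psd(n, rng), random_psd(n, rng)
k, kt, h = 0.7, 0.3, 1e-5

f, ft = f_and_ftilde(k, kt, A, R, T, Q, sigma2)
dV_dk = (V(k + h, kt, A, R, T, Q, sigma2)
         - V(k - h, kt, A, R, T, Q, sigma2)) / (2 * h)
dV_dkt = (V(k, kt + h, A, R, T, Q, sigma2)
          - V(k, kt - h, A, R, T, Q, sigma2)) / (2 * h)
closed_dk = -n * sigma2 * (kt - ft)
closed_dkt = -n * sigma2 * (k - f)
```

Both derivatives vanish exactly when $(\kappa, \tilde{\kappa})$ solves the fixed-point system, which is the property exploited by the optimization algorithm.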

4. THE INPUT COVARIANCE OPTIMIZATION 
ALGORITHM. 

The results of Section 3 show that $I(Q)$ can be approximated with a
good accuracy by $\bar{I}(Q)$. Therefore, the optimum input
covariance matrix can itself be approximated by the argument of the
maximum of $\bar{I}(Q)$ over the set $\mathcal{C}_1$. In this
section, we propose an attractive maximization algorithm for
$\bar{I}(Q)$. Before presenting the algorithm, we have to introduce
some concepts and results.

Definition 1 Let $W(Q)$ be a function defined on $\mathcal{C}_1$. If
Q, P are 2 elements of $\mathcal{C}_1$, then W is said to be
differentiable in the Gateaux sense at point Q in the direction
P − Q if the limit

$$\lim_{\lambda \to 0^+} \frac{W(Q + \lambda(P - Q)) - W(Q)}{\lambda} \tag{14}$$

exists. In this case, this limit is denoted
$\langle W'(Q), P - Q \rangle$.

Note that for each $\lambda \in [0, 1]$, matrix
$Q + \lambda(P - Q) = (1 - \lambda)Q + \lambda P$ of course belongs
to $\mathcal{C}_1$. Therefore, $W(Q + \lambda(P - Q))$ makes sense
for $\lambda > 0$ small enough.
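As a small illustration of Definition 1, the sketch below takes a function of the form $W(Q) = \log\det(\mathbf{I} + GQG)$, with G a fixed Hermitian positive definite matrix chosen at random, and compares the difference quotient of Eq. (14) with the closed form $\mathrm{Tr}[G(\mathbf{I} + GQG)^{-1}G(P - Q)]$. The choice of W, the helper names and the random test matrices are assumptions made for this example only.

```python
import numpy as np

def W(Q, G):
    """W(Q) = log det(I + G Q G) for a fixed Hermitian G."""
    n = Q.shape[0]
    return np.linalg.slogdet(np.eye(n) + G @ Q @ G)[1]

def gateaux_closed_form(Q, P, G):
    """<W'(Q), P - Q> = Tr[G (I + G Q G)^{-1} G (P - Q)]."""
    n = Q.shape[0]
    M = np.eye(n) + G @ Q @ G
    return np.trace(G @ np.linalg.solve(M, G @ (P - Q))).real

def gateaux_numeric(Q, P, G, lam=1e-6):
    """One-sided difference quotient of Eq. (14) for a small lambda > 0."""
    return (W(Q + lam * (P - Q), G) - W(Q, G)) / lam

def random_point_of_C1(n, rng):
    """Random nonnegative Hermitian matrix with (1/n) Tr(Q) = 1."""
    Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    M = Z @ Z.conj().T
    return n * M / np.trace(M).real

rng = np.random.default_rng(2)
n = 4
Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
G = Z @ Z.conj().T / n + np.eye(n)      # hypothetical positive definite G
Q = random_point_of_C1(n, rng)
P = random_point_of_C1(n, rng)
numeric = gateaux_numeric(Q, P, G)
closed = gateaux_closed_form(Q, P, G)
```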

Proposition 1 Let W be a strictly concave function defined on
$\mathcal{C}_1$. Then, the maximum of W on $\mathcal{C}_1$ is reached
at a unique point $Q_*$ of $\mathcal{C}_1$. Assume that for all
elements Q, P of $\mathcal{C}_1$, W is differentiable in the Gateaux
sense at point Q in the direction P − Q. Then, $Q_*$ is the unique
element of $\mathcal{C}_1$ verifying

$$\langle W'(Q_*), Q - Q_* \rangle \leq 0 \tag{15}$$

for each element Q of $\mathcal{C}_1$.



This result is a simple adaptation of known results (see e.g. [13]).
The proof is therefore omitted. We now give some useful properties of
function $\bar{I}$.

Proposition 2 Function $\bar{I}(Q)$ is strictly concave on
$\mathcal{C}_1$. Moreover, for all elements Q, P of $\mathcal{C}_1$,
$\bar{I}$ is differentiable in the Gateaux sense at point Q in the
direction P − Q.

The fact that $\bar{I}$ is Gateaux differentiable is rather obvious.
The strict concavity of $\bar{I}$ needs some work, but is not
surprising because it is an approximant of a strictly concave
function.

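Proposition 1 thus implies that the maximum of $\bar{I}$ on
$\mathcal{C}_1$ is reached at a unique point denoted $Q_*$. Before
presenting our maximization algorithm of $\bar{I}$, we first give
some insights on the structure of matrix $Q_*$. For this, we denote
$\delta(Q_*)$ and $\tilde{\delta}(Q_*)$ by $\delta_*$ and
$\tilde{\delta}_*$ respectively. Then, we have the following result.

Proposition 3 Matrix $Q_*$ is the solution of the standard
water-filling problem: maximize over $Q \in \mathcal{C}_1$ the
function

$$U(Q) = \log\det\left[\mathbf{I} + G(\delta_*, \tilde{\delta}_*)\, Q\, G(\delta_*, \tilde{\delta}_*)\right]$$

where $G(\delta_*, \tilde{\delta}_*) = \left[ \delta_* T + \frac{1}{\sigma^2} A^H (\mathbf{I} + \tilde{\delta}_* R)^{-1} A \right]^{1/2}$.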



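Proof. The proof of this result is based on the following identity,
to be proved below:

$$\langle \bar{I}'(Q_*), Q - Q_* \rangle = \langle V'(\delta_*, \tilde{\delta}_*, Q_*), Q - Q_* \rangle \tag{16}$$

for each $Q \in \mathcal{C}_1$, where
$\langle V'(\delta_*, \tilde{\delta}_*, Q_*), Q - Q_* \rangle$
represents the Gateaux differential of function
$Q \mapsto V(\delta_*, \tilde{\delta}_*, Q)$. Indeed, if (16) holds,
then Proposition 1 implies that

$$\langle V'(\delta_*, \tilde{\delta}_*, Q_*), Q - Q_* \rangle \leq 0$$

for each $Q \in \mathcal{C}_1$. By Proposition 1, $Q_*$ maximizes the
function $Q \mapsto V(\delta_*, \tilde{\delta}_*, Q)$, i.e. $U(Q)$,
because the latter functions differ by a constant term. It remains to
prove (16). For this, we remark that, by (13),

$$\left( \frac{\partial V}{\partial \kappa} \right)_{(\delta_*, \tilde{\delta}_*, Q_*)} = \left( \frac{\partial V}{\partial \tilde{\kappa}} \right)_{(\delta_*, \tilde{\delta}_*, Q_*)} = 0 \tag{17}$$

On the other hand, for each Q, P,

$$\langle \bar{I}'(P), Q - P \rangle = \langle V'(\delta(P), \tilde{\delta}(P), P), Q - P \rangle + \left( \frac{\partial V}{\partial \kappa} \right)_{(\delta(P), \tilde{\delta}(P), P)} \langle \delta'(P), Q - P \rangle + \left( \frac{\partial V}{\partial \tilde{\kappa}} \right)_{(\delta(P), \tilde{\delta}(P), P)} \langle \tilde{\delta}'(P), Q - P \rangle \tag{18}$$

where $\langle \delta'(P), Q - P \rangle$ and
$\langle \tilde{\delta}'(P), Q - P \rangle$ represent the Gateaux
differentials of functions $\delta$ and $\tilde{\delta}$. Eq. (17)
thus implies (16).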

$\delta_*$ and $\tilde{\delta}_*$ depend on matrix $Q_*$. Therefore,
Proposition 3 does not provide by itself any optimization algorithm.
However, it gives insights on the structure of $Q_*$. Consider first
the case R = I and T = I. Then, $G(\delta_*, \tilde{\delta}_*)^2$ is
a linear combination of I and matrix $A^H A$. The eigenvectors of
$Q_*$ thus coincide with the right singular vectors of matrix A, a
result consistent with the work [10] devoted to the maximization of
the average mutual information $I(Q)$. If R = I and T ≠ I,
$G(\delta_*, \tilde{\delta}_*)^2$ can be interpreted as a linear
combination of matrices T and $A^H A$. Therefore, if the transmit
antennas are correlated, the eigenvectors of the optimum matrix $Q_*$
coincide with the eigenvectors of some weighted sum of T and $A^H A$.
This result provides a simple explanation of the impact of correlated
transmit antennas on the structure of the capacity-achieving input
covariance matrix. The effect of correlated receive antennas on $Q_*$
is however less intuitive because matrix $A^H A$ has to be replaced
by $A^H (\mathbf{I} + \tilde{\delta}_* R)^{-1} A$.
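The water-filling step of Proposition 3 can be sketched as follows: given the Hermitian positive definite matrix $G^2$, the maximizer of $\log\det(\mathbf{I} + GQG)$ over nonnegative Q with a fixed trace shares the eigenvectors of $G^2$, and its eigenvalues are obtained by classical water-filling on the eigenvalues of $G^2$. The function names `waterfill` and `objective` and the diagonal test matrix are hypothetical choices for this illustration.

```python
import numpy as np

def waterfill(G2, total):
    """Maximize log det(I + G Q G) over Q >= 0 with Tr(Q) = total.

    G2 is the Hermitian positive definite matrix G^2. Returns the
    optimal Q and its eigenvalues, listed in the eigenvalue order
    of np.linalg.eigh(G2).
    """
    g, U = np.linalg.eigh(G2)
    inv = 1.0 / g                         # assumes G2 is positive definite
    inv_sorted = np.sort(inv)
    # Try the largest number m of active eigen-directions first.
    for m in range(len(g), 0, -1):
        mu = (total + inv_sorted[:m].sum()) / m       # water level
        if mu > inv_sorted[m - 1]:
            break
    q = np.clip(mu - inv, 0.0, None)
    return (U * q) @ U.conj().T, q

def objective(G2, Q):
    """log det(I + G Q G), computed as log det(I + G^2 Q)."""
    n = Q.shape[0]
    return np.linalg.slogdet(np.eye(n) + G2 @ Q)[1]

# Hypothetical example with n = 3, so that (1/n) Tr(Q) = 1 means Tr(Q) = 3.
G2 = np.diag([4.0, 1.0, 0.1])
Q_opt, q = waterfill(G2, 3.0)
gain = objective(G2, Q_opt) - objective(G2, np.eye(3))
```

In this example the weakest eigen-direction receives no power, and the water-filled matrix improves on the uniform allocation Q = I.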

We are now in position to introduce our maximization algorithm of
$\bar{I}$. It is mainly motivated by the simple observation that for
each fixed $(\kappa, \tilde{\kappa})$, the maximization w.r.t. Q of
function $V(\kappa, \tilde{\kappa}, Q)$ defined by (10) can be
achieved by a standard waterfilling procedure, which, of course, does
not need the use of numerical techniques. On the other hand, for Q
fixed, the equations (4) have unique solutions that, in practice, can
be obtained using a standard fixed-point algorithm. Our algorithm
thus consists in adapting parameters Q and
$(\delta, \tilde{\delta})$ separately by the following iterative
scheme:

• Initialization: $Q_0 = \mathbf{I}$, $(\delta_1, \tilde{\delta}_1)$
are defined as the unique solutions of system (4) in which
$Q = Q_0 = \mathbf{I}$. Then, define $Q_1$ as the maximum of function
$Q \mapsto V(\delta_1, \tilde{\delta}_1, Q)$ on $\mathcal{C}_1$.

• Iteration k: assume $Q_{k-1}$,
$(\delta_{k-1}, \tilde{\delta}_{k-1})$ available. Then,
$(\delta_k, \tilde{\delta}_k)$ is defined as the unique solution of
(4) in which $Q = Q_{k-1}$. Then, define $Q_k$ as the maximum of
function $Q \mapsto V(\delta_k, \tilde{\delta}_k, Q)$ on
$\mathcal{C}_1$.
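The iterative scheme above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the inner fixed-point solver, the tolerances and the stopping rule (based on the change of $(\delta_k, \tilde{\delta}_k)$ between iterations) are assumptions, and T is assumed positive definite so that the water-filling step is well defined.

```python
import numpy as np

def psd_sqrt(M):
    """Hermitian square root of a positive semidefinite matrix."""
    w, U = np.linalg.eigh(M)
    return (U * np.sqrt(np.clip(w, 0.0, None))) @ U.conj().T

def solve_system(A, R, T, Q, sigma2, tol=1e-12, max_iter=5000):
    """Fixed-point iteration on system (4) for a fixed Q."""
    N, n = A.shape
    Qh = psd_sqrt(Q)
    TQ, B = Qh @ T @ Qh, A @ Qh
    k = kt = 1.0
    for _ in range(max_iter):
        M1 = (sigma2 * (np.eye(N) + kt * R)
              + B @ np.linalg.solve(np.eye(n) + k * TQ, B.conj().T))
        M2 = (sigma2 * (np.eye(n) + k * TQ)
              + B.conj().T @ np.linalg.solve(np.eye(N) + kt * R, B))
        k_new = np.trace(np.linalg.solve(M1, R)).real / n
        kt_new = np.trace(np.linalg.solve(M2, TQ)).real / n
        step = abs(k_new - k) + abs(kt_new - kt)
        k, kt = k_new, kt_new
        if step < tol:
            break
    return k, kt

def waterfill(G2, total):
    """Maximizer of log det(I + G Q G) over Q >= 0 with Tr(Q) = total."""
    g, U = np.linalg.eigh(G2)
    inv = 1.0 / g
    inv_sorted = np.sort(inv)
    for m in range(len(g), 0, -1):
        mu = (total + inv_sorted[:m].sum()) / m
        if mu > inv_sorted[m - 1]:
            break
    return (U * np.clip(mu - inv, 0.0, None)) @ U.conj().T

def optimize_covariance(A, R, T, sigma2, max_iter=100, tol=1e-10):
    """Alternate between solving (4) and water-filling, starting at Q = I."""
    N, n = A.shape
    Q = np.eye(n)
    history = []                          # successive (delta_k, delta_tilde_k)
    for _ in range(max_iter):
        d, dt = solve_system(A, R, T, Q, sigma2)
        G2 = d * T + A.conj().T @ np.linalg.solve(np.eye(N) + dt * R, A) / sigma2
        Q = waterfill(G2, float(n))       # enforces (1/n) Tr(Q) = 1
        history.append((d, dt))
        if len(history) > 1:
            change = abs(d - history[-2][0]) + abs(dt - history[-2][1])
            if change < tol:              # numerical check of small increments
                break
    return Q, history

# Hypothetical example: R = T = I and a diagonal mean matrix A.
A = np.diag([2.0, 0.5, 0.1])
Q_star, history = optimize_covariance(A, np.eye(3), np.eye(3), sigma2=1.0)
```

With R = T = I and a diagonal A, the returned matrix is diagonal, in line with the remark that the eigenvectors of the optimum then coincide with the right singular vectors of A.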

We now study the convergence properties of this algorithm, and state
a result which implies that if the algorithm converges, then it
converges to the global maximum of $\bar{I}$.

Proposition 4 Assume that the 2 sequences $(\delta_k)_{k \geq 0}$ and
$(\tilde{\delta}_k)_{k \geq 0}$ verify

$$\lim_{k \to +\infty} \delta_k - \delta_{k-1} = 0, \qquad \lim_{k \to +\infty} \tilde{\delta}_k - \tilde{\delta}_{k-1} = 0 \tag{19}$$

Then, the sequence $(Q_k)_{k \geq 0}$ converges toward the maximum
$Q_*$ of $\bar{I}$ on $\mathcal{C}_1$.

Due to the lack of space, the proof is omitted.

Proposition 4 implies that if the sequence $(Q_k)_{k \geq 0}$ is
convergent, then its limit coincides with the optimum matrix $Q_*$.
In fact, if $(Q_k)_{k \geq 0}$ converges, then the 2 sequences
$(\delta_k)_{k \geq 0}$, $(\tilde{\delta}_k)_{k \geq 0}$ also
converge. This of course implies condition (19), and the convergence
of $(Q_k)_{k \geq 0}$ toward $Q_*$.

Unfortunately, we have not been able to prove the convergence of
$(Q_k)_{k \geq 0}$ by itself. However, all the numerical experiments
we have conducted tend to indicate that the algorithm is convergent.
In any case, condition (19) is very easy to verify during the
algorithm execution. In case of non convergence, other numerical
techniques could be used in order to optimize $\bar{I}(Q)$, a simpler
task than the optimization of $I(Q)$.

                 n = N = 2    n = N = 4    n = N = 8
Vu-Paulraj       0.75         8.2          138
New algorithm    10^-?        3.10^-?      7.10^-?

Fig. 1. Average time per iteration in seconds



5. COMPARISON WITH THE VU-PAULRAJ'S
ALGORITHM.

In this section, we compare our algorithm with the method presented
in [19], based on the maximization of $I(Q)$. We recall that
Vu-Paulraj's algorithm is based on a Newton method and a barrier
interior point method. Moreover, the average mutual informations and
their first and second derivatives are evaluated by Monte-Carlo
simulations. In fig. 2 we have evaluated
$C_E = \max_{Q \in \mathcal{C}_1} I(Q)$ versus the SNR for
n = N = 4. Matrix H coincides with the example considered in [19].
The solid line corresponds to the results provided by Vu-Paulraj's
algorithm; the number of trials used to evaluate the mutual
information and its first and second derivatives is equal to 30,000,
and the maximum number of iterations is fixed to 10. The dashed line
corresponds to the results provided by our algorithm: each point
represents $I(Q_*)$ at the corresponding SNR, where $Q_*$ is the
"optimal" matrix provided by our approach; the average mutual
information at point $Q_*$ is evaluated by Monte-Carlo simulation
(30,000 trials are used). The number of iterations is also limited to
10. Figure 2 shows that our asymptotic approach provides the same
results as Vu-Paulraj's algorithm. However, our algorithm is
computationally much more efficient, as the above table shows. The
table gives the average execution time (in sec.) of one iteration for
both algorithms for n = N = 2, n = N = 4, n = N = 8.

In fig. 3 we again compare Vu-Paulraj's algorithm and our proposal.
Matrix A is generated according to (1), the angles being chosen at
random. The transmit and receive antenna correlations are exponential
with parameters $0 < \rho_t < 1$ and $0 < \rho_r < 1$ respectively.
In the experiments, n = N = 4, while various values of $\rho_t$,
$\rho_r$ and of the Rice factor K have been considered. As in the
previous experiment, the maximum number of iterations for both
algorithms is 10, while the number of trials generated to evaluate
the average mutual informations and their derivatives is equal to
30,000. Our approach again provides the same results as Vu-Paulraj's
algorithm, except for low SNRs for K = 1, $\rho_t = 0.5$,
$\rho_r = 0.8$, where our method gives better results: at these
points, Vu-Paulraj's algorithm seems not to have converged at the
10th iteration.

Fig. 3. Comparison with the Vu-Paulraj's algorithm II

6. CONCLUSION

In this paper we proposed a new approach to characterize the
capacity-achieving covariance matrix of bi-correlated Rician MIMO
channels. We proposed to approximate the average mutual information
by its large system limit and derived an attractive iterative
optimization algorithm which does not need the use of intricate
numerical techniques. We have shown that the algorithm (when it is
convergent) converges to the maximum of the approximate mutual
information. Numerical simulation results show that the new approach
provides the same results as direct maximization approaches of the
mutual information, while being computationally much more attractive.